Frédéric Chyzak, Michael Drmota, Thomas
Klausner, and Gerard Kok
Abstract.
Let denote the set of unrooted labeled trees of size and let
be a particular (finite, unlabeled) tree.
Assuming that every tree of is equally likely, it is shown that
the limiting distribution as goes to infinity of the number of
occurrences of as an induced subtree is asymptotically normal
with mean value and variance asymptotically equivalent to
and , respectively, where the constants
and are computable.
INRIA-Rocquencourt, F-78153 Le Chesnay cedex, France, e-mail:
frederic.chyzak@inria.fr
Institut für Diskrete Mathematik und Geometrie, Technische
Universität Wien,
Wiedner Hauptstraße 8-10/113, A-1040 Wien, Austria, e-mail:
michael.drmota@tuwien.ac.at,
klausner@dmg.tuwien.ac.at
Delft Institute of Applied Mathematics, Delft University of
Technology,
Mekelweg 4, NL-2628 CD Delft, The Netherlands, e-mail:
gkok@fsmat.at
This research was supported by the Austrian Science Foundation
FWF, grants S8302 and S9604, and by the European Amadeus project.
1. Introduction
In this paper we consider unrooted labeled trees and analyse the
number of occurrences of a tree pattern as an induced subtree of a
random tree. It is well known that a typical tree in , the set of
unrooted labeled trees of size , has about nodes of
degree , where . Moreover, for any fixed the
total number of nodes of degree over all trees in satisfies
a central limit theorem with mean and variance asymptotically
equivalent to and (for a specific
constant ). See [DG99], where Drmota and
Gittenberger explored this phenomenon for unrooted labeled trees and
other types of trees.
A node of degree is an occurrence of what can be called a star
with edges. In this paper we continue this idea. We consider a
pattern , a given finite tree, and compute the limiting
distribution of the number of occurrences of in as
. Note also that there can be overlaps of two or more
copies of , which we intend to count as separate occurrences.
Our main result in this paper is:
Theorem 1.
Let be a given finite tree. Then the limiting distribution of
the number of occurrences of (as induced subtrees) in a tree of
is asymptotically normal with mean and variance asymptotically
equivalent to and , respectively, where
and depend on the pattern
and can be computed explicitly and
algorithmically and can be represented as polynomials (with rational
coefficients) in .
We consider here a random variable as Gaussian if its
characteristic function is given by , that is, the case of zero variance is
included here. For example, if consists just of one edge (and
two nodes), then the number of occurrences of in is
and thus constant. So in that particular case we have and
. Nevertheless we conjecture that in
all other cases.
As already mentioned, the case of stars (or nodes of given degree)
has been discussed in [DG99] for various classes of trees.
Some previous work for unlabeled trees is due to Robinson and
Schwenk [RS75]. Patterns in (rooted planar) trees have also
been considered by Dershowitz and Zaks [DZ89] under the
limitation that patterns start at the root. In a work on patterns
in random binary search trees, Flajolet, Gourdon, and Martínez
[FGM97] obtained a central limit theorem. Flajolet and
Steyaert also analysed an algorithm for pattern matchings in trees
[FS80a, FS80b, SF83]. Further Ruciǹski
[Ruc88] established conditions for when the number of
occurrences of a given subgraph in random graphs follows a normal
distribution.
The plan of the paper is as follows. In
Section 2 we give a short introduction to
counting trees with generating functions, and also expand this to two
variables for counting stars (nodes of specific degree ) in
trees. In Section 3 we expand this framework
to the counting of patterns in trees. The resulting asymptotics are
presented in Section 4, concluding the
proof of Theorem 1. Technical details
for this as well as explicit algorithms can be found in the appendix.
In fact, the algorithmic aspect is one of the driving forces of this
paper.
2. Counting Trees and Counting Stars in Trees
In this section we introduce a three-step program to count the
number of trees in and in the same fashion the number of
occurrences of nodes of degree in . While redundant and
probably heavy in this simplistic situation, this procedure was
crucial to the derivation in [DG99] for counting stars and will
generalise well to our setting of general tree patterns.
For this purpose we make use of the sets of rooted labeled
trees of size and of planted labeled trees of size .
For rooted and unrooted trees, the size counts the total number
of nodes, whether internal or at the leaves. On the other hand, a
planted tree is just a rooted tree where the root is adjoined an
additional “phantom” node which does not contribute to the size of
the tree, whereas the degree of the root is increased by one. As
well, one can think of a planted tree as a rooted tree with an
additional edge having no end vertex. The advantage of using planted
trees, though it seems to add complexity, will be explained below.
Obviously and . It is also well
known that and .
The three-step program is the following one: First, the generating
function enumerating planted trees is determined, then it is used to
count rooted trees by deriving their generating function, and finally
the generating function counting unrooted trees is computed.
We define
and proceed in the following way:
(1)
Planted Rooted Trees:
A planted tree is a planted root node with zero, one, two,
planted subtrees of any order. In terms of the generating function
this yields
(2)
Rooted Trees:
For rooted trees we get the same (except for the phantom nodes which
are not present here), just a root with zero, one, two,
planted subtrees of any order
(3)
Unrooted Trees:
Finally, we have , as already mentioned. However,
we can also express by a relation which
follows from a natural bijection between
rooted trees on the one hand and unrooted trees and pairs of planted rooted
trees (that are joined by identifying the additional edges at their planted
roots and discarding the phantom nodes) on the
other hand.111Consider the class of rooted (labeled) trees. If
the root is labeled by then consider the tree as an unrooted
tree. If the root is not labeled by then consider the first edge
of the path between the root and and cut the tree into two planted
rooted trees at this edge. This yields
The functional equation for can be either used to extract
the explicit number via Lagrange inversion or to obtain
the radius of convergence and asymptotic expansions of the singular
behaviour of this function. It is well known that is the common
radius
of convergence
of , , and , and that the singularity
at is of square-root type:
This is reflected by the asymptotic expansions of the numbers
In order to demonstrate the usefulness of the three-step procedure above
we repeat the same steps for
counting stars with edges in trees, that is, the number of nodes of
degree , a given fixed positive number.
Let denote the number of planted trees of size
with exactly nodes of degree . Furthermore, let and
be the corresponding numbers for rooted and unrooted trees and
set
Unrooted Trees:
Similarly to the above we have which is
sufficient for our purposes. However, as above,
it is also possible to express by
Note that the use of the notion of planted trees is crucial in order
to keep track of the nodes of degree by means of the recursive
structure of planted trees.
In [DG99] this approach was used to show that the asymptotic
distribution of the number of nodes of degree in trees of size is
normal,
with expectation and variance proportional to .
3. Counting Patterns in Trees
We now generalize the counting procedure of
Section 2 to more complicated
patterns. For our purpose, a pattern is a given (finite unrooted unlabeled)
tree . To ease explanations, we will use as the example
graph in Figure 1.
Figure 1. Example pattern
We say that a specific pattern
occurs in a tree if occurs in as an induced
subtree in the sense that the node degrees for the internal
(filled) nodes in the pattern match the degrees of the corresponding nodes
in , while the external (empty) nodes match nodes of
arbitrary degree.222More generally we could also consider
pattern-matching problems
for patterns in which some degrees of certain possibly external “filled” nodes
must
match exactly while the degrees of the other, possibly internal “empty” nodes
might be
different. But then the situation is more involved, see
Section 5. Because the results for the patterns
consisting of only one node or two nodes and one edge are trivial,
we now concentrate on patterns with at least three nodes.
Our principal aim is to get relations for the generating functions which count
the number of occurrences of a specific pattern .
Let denote the number of planted rooted trees with
nodes and exactly occurrences of the pattern and let
be the corresponding generating function.
3.1. Generating Functions for Planted Rooted Trees
Proposition 1.
(Planted Rooted Trees)
Let be a pattern. Then there exists a certain number of
auxiliary functions with
and polynomials with
non-negative coefficients such that
(1)
Furthermore,
where means that all Taylor coefficients of the left-hand
side are smaller than or equal to
the corresponding coefficients of the right-hand side.
Moreover, the dependency graph of this system is strongly
connected.333The notion of dependency graph is explained
in Appendix B and intuitively speaking, reflects
the fact that no
subsystem can be solved before the whole system.
The proof of this proposition is in fact the core of the paper. In
order to make the arguments more transparent we will demonstrate
them with the help of the example pattern in Figure 1. At
each step of the proof we will also indicate how to make all
constructions explicit so that it is possible to generate
System (1) effectively.
Figure 2. Planted pattern matching
In a first step we introduce the notion of a
planted pattern. A planted pattern is just a planted
rooted tree where we again distinguish between internal (filled) and
external (empty) nodes. It matches a planted rooted tree from if
occurs as an induced subtree starting from the (planted) root,
that is, the branch structure and node degrees of the filled nodes
match. Two occurrences may overlap. For example, in
Figure 2 the planted pattern on the left
matches the planted tree twice (following the left, resp. the
right edge from the root), but not at all. Also remark that,
notwithstanding the symmetry of , the pattern really matches
twice, as we are interested in matches in labeled trees.
Figure 3. Planted patterns for the pattern in Figure 1
We now construct a planted pattern for each internal (filled) node
of our pattern which is adjacent to an external (empty) node.
The internal (filled) node is considered as the planted root and one
of the free attached leaves as the plant. In our example we obtain
the two graphs in Figure 3.
The next step is to partition all planted trees according to their
degree distribution up to some adequate level. To this end, let
denote the set of out-degrees that occur in the planted patterns
introduced above and be the maximal height of these patterns. In our
example we have and . For obtaining a partition, we
more precisely consider all trees of height less than or equal to
with out-degrees in . We distinguish two types of leaves in these trees,
depending on the depth at which they appear:
leaves in level , denoted “”, and leaves at levels
less than , denoted “”. For our example we get 11 different
trees , depicted on Figure 4.
Figure 4. Tree partition
These trees induce a natural partition of all planted trees for the
following interpretation of the two types of leaves: We say that a
tree is contained in class444By abuse of notation the tree
class corresponding to the finite tree is denoted by the same
symbol . if it matches the finite tree (or pattern)
in such a way that a node of type has degree not in , while
a node of type has any degree. For example,
corresponds to those planted trees where the out-degree of the
root is not in .
It is easy to observe that these (obviously disjoint) classes of
trees form a partition. Indeed, take any rooted tree. For any path
from the root to a leaf, consider the first node with out-degree not
in , and replace the whole subtree at it with . Then
replace any node at depth with . The tree obtained in
this way is one in the list.
Furthermore, the classes above can be described
recursively. To this end, it proves
convenient to introduce a formal notation to describe operations
between classes of trees: denotes the disjoint union of
classes; denotes set difference; recursive descriptions of tree
classes are given in the form , to express that the class is constructed
by attaching subtrees from the class , subtrees
from the class , etc, to a root node that we denote .
In our example we get the following relations:
This is to be interpreted as follows. Trees in consist of a
(planted) root that is denoted by that has out-degree , and two
children that are of out-degree distinct from , that is, in .
Similarly, trees in consist of a root with out-degree
and subject to the following additional constraints: one subtree at the root is
exactly of
type ; the other subtree, call it , is of out-degree 2,
either with both subtrees
of degree other than (leading to in ), or with one subtree of degree and the other of
degree other than (leading to in ), or
with both of its subtrees of degree (leading to in
class ). Summarizing:
. Of course
this can be also interpreted as .
Another more involved example corresponds to ; here both subtrees
are of the form .
To show that the recursive description can be obtained easily in
general, consider a tree obtained from some planted
pattern . Let , …, denote its subtrees at the
root. Then, in each , leaves of type can appear only
at level . Substitute for all such either or a
node of out-degree chosen from and having for all its
subtrees. Do this substitution in all possible ways. The collection
of trees obtained are some of the ’s, say ,
, etc. Thus, we obtain the recursive relation
for .
In general, we obtain a partition of classes
and corresponding recursive descriptions, where
each tree type can be expressed as a disjoint union of tree
classes of the kind
(2)
where denotes the degree of the root of and the non-negative
integer is the number of repetitions of the tree type .
We proceed to show that this directly leads to a system of
equations of the form (1), where
each polynomial relation stems from a recursive equation between
combinatorial classes.
Let be the set of tuples with the property that
if and only if the term of type
(2) is involved in the recursive description
of (in expanded form). Further, let
denote the number of additional occurrences of the pattern
in (2) in the following sense: if
and is a (planted rooted) labeled tree of
with subtrees , , etc, and
occurs times in , times in , etc, then
contains exactly times. The
number corresponds to the number of occurrences of in
in which the root of occurs as internal node of the pattern.
By construction of the classes this number only depends on
and not on the particular tree . Let us clarify the calculation of with an
example. Consider the class of the partition for the example pattern. Now, in order to determine the number of additional occurrences, we match the planted
patterns of Figure 3 at the root of an
arbitrary tree of class . The left planted pattern of
Figure 3 matches three times, the right one
matches once. Thus we find that in this case .
For the other classes we find the following values of
:
Terms of class
Value of
0
0
0
1
2
1
2
3
3
4
5
.
Now define series by
These are in fact polynomials for by the finiteness of
the corresponding . All matches of the planted
patterns are handled in the , , thus
does not depend on .
In our pattern we get for example for
Finally, let denote the number of planted rooted trees
of type with nodes and occurrences of the pattern
and set
By this definition it is clear that
because the size of labeled trees is counted by (exponential generating function)
and the occurrences of the patterns is additive and counted by .
Hence, we explicitly obtain the proposed structure of the system of functional
equations (1).
For the example pattern we arrive at the following system of
equations, where we denote the generating function of the
class by the same symbol :
In order to complete the proof of Proposition 1 we just
have to show that the dependency graph is strongly connected. By
construction, depends on all functions
. Thus, it is sufficient to prove that every
() also depends on . For this purpose consider
the subtree of that was labeled by and consider a path
from its root to an empty node. Each edge of this path corresponds
to another subtree of , say , . Then, by construction of the system of functional
equations above, depends on , depends on
etc. Finally the root of is adjacent to an empty
node and thus (the corresponding generating function) depends on
. This completes the proof of Proposition 1.
Note that we obtain a relatively more compact form of this system by introducing
(3)
together with the recursive relations
The combinatorial classes corresponding to the (which we will
also denote by ) have the interpretation shown in Figure
5. We could have obtained the classes directly by restraining the
construction to a maximal depth instead of . In principle,
we could then apply the analytic treatment of
Section 4 to the system of the .
However we feel that the existence of a recursive structure of the
system of the with a well-defined for each
term in the recursive description is slightly less clear. Therefore
we preferred to work with the which have a
well-defined . In Appendix A we will
discuss another algorithm that yields in general even more compact
systems of equations.
Figure 5. The classes corresponding to the of equations (3)
3.2. From Planted Rooted Trees to Rooted and Unrooted Trees
The next step is to find equations for the exponential generating
function of rooted trees (where occurrences of the pattern are marked
with ). As above we set
where denotes the number of rooted trees of size with
exactly occurrences of the pattern . (That is, occurrences
of the rooted patterns deducible from . Here, a
rooted pattern is defined in a very similar way as a planted
pattern.)
Proposition 2.
(Rooted Trees)
Let be a pattern
and let
denote the auxiliary functions introduced in Proposition 1.
Then there exists a polynomial
with non-negative coefficients satisfying , and such that
(4)
for
(5)
Proof.
The proof is in principle a direct continuation of the proof of
Proposition 1. We recall that a rooted tree is just a root
with zero, one, two, planted subtrees, i.e., the class of
rooted trees can be described as a disjoint union of classes of
rooted trees of the form . Furthermore, let
denote the number of classes in this term such that , and set to be
the number of additional occurrences of the pattern . This
number again corresponds to the number of occurrences of in a
(rooted) tree in which the root of occurs as internal node of
the pattern.
Set
Then by construction
Note that . Let denote the set of degrees of the
internal (filled) nodes of the pattern, that is, ; then
does not depend on if . With
we obtain (4) and (5). The number is well-defined for a similar reason as was
, and can be calculated similarly.
∎
Figure 6. Rooted patterns for the pattern in Figure 1
We again illustrate the proof with our example. In
Figure 6 the corresponding rooted patterns are
shown. For convenience let denote the function
where . The function might also be
interpreted as a
catch-all function for the “uninteresting” subtrees—just a
root with an unspecified number of planted trees attached,
except the ones we handle differently, namely the cases . The generating function for rooted trees is then
given by
As above we have , where denotes the
number of unrooted trees with nodes and exactly occurrences
of the pattern . This relation is sufficient for our purposes. It
is also possible to express the corresponding generating function
. In a way similar as before, we can define the number of
additional occurrences of the pattern that
appear by constructing an unrooted tree from two planted trees of
the class and by identifying the additional edges at
their planted roots and discarding the phantom nodes. For our
example we get
4. Asymptotic Behavior
Since we are not interested in the actual number of occurrences of
the pattern, but only in its asymptotic behavior, we do not have to
compute explicit formulae from the system of equations. Instead, we
apply a result slightly adapted from [Drm97] which we state and
discuss in Appendix B. In fact, it is
immediately clear that Theorem 2 in this appendix,
whose object is the proof of Gaussian limiting distributions,
applies to the kind of problem we are interested in: the assertions
of Propositions 1 and 2 exactly fit the
assumptions of Theorem 2.
The only missing point is the existence of a non-negative solution
of the system
(6)
(7)
where (6) is the system of functional equations
of Proposition 1
and is the Jacobian matrix of .
Since the sum of all unknown functions is known for
:
it is not unexpected that .
Proposition 3.
There exists a unique non-negative solution of
System (6–7), for which and the components of
are polynomials (with rational coefficients) in .
Proof.
For a proof, set and consider the solution . Since the dependency graph is
strongly connected it follows that all functions have the
same radius of convergence which has to be , and all
functions are singular at . Since for it also follows that is
finite, and we have . If we
had the inequality then the implicit function theorem would imply the existence of
an analytic continuation for around , which is, of
course, a contradiction. Thus, the determinant is zero and system (6–7) has a unique solution.
To see that the components (with ) of
are polynomials in we will construct the partition
on which the system of equations (6–7) is based by refining step by step the
trivial partition consisting of only one class . The recursive description of this trivial
partition is given by the formal equation .
Additionally, the solution of the corresponding equation for the generating function
(denoted by the same symbol ) is given by , with clearly a
(constant) polynomial in . Now let again denote
the set of out-degrees that occur in the planted patterns. We will refine by introducing
for each a class consisting of all trees of root out-degree , as well as a class for trees with root out-degree not in . The partition
has the recursive description
(8)
and the solution of the corresponding system of equations
(9)
is given by
(10)
thus again polynomials in . We continue by refining this last partition by introducing
classes (for some ) for each term at the right-hand side of
(8) after expanding the “multinomial”. Such a class is of the form
with
natural numbers . We get a
new partition which has a recursive description by
construction (because we can replace the by disjoint unions of certain ).
The corresponding system of equations for the generating functions is given by
and consequently we have for the solution
with the of (10). Thus the are again polynomials in .
By continuing this procedure until level (i.e., performing the refinement step times)
we end up with the partition and we see that the solution for the corresponding
system of equations consists of polynomials in , which completes
the proof of Proposition 3.
∎
Note that there is a close link with Galton–Watson branching processes.
Let denote a Poisson offspring distribution. Now we interpret a class
as the class of process realizations for which the (non-planar) branching structure at the
beginning of the processes corresponds to the root structure of .
Then is just the probability of this event.
We now solve the system of equations obtained for the example pattern.
We have . The components of can easily be obtained by following the
construction of the proof of Proposition 4 (or we use
the branching process interpretation). For example, if we set for the
probability of an out-degree and then we get
. The factor
comes from the fact that the two subtrees of the root may be interchanged,
see Figure 4. The other classes can be treated similarly and we find:
(11)
We are now ready to complete the proof of the main part of Theorem 1.
By Propositions 1–3 we can apply
Theorem 2 and it follows that the numbers
have a Gaussian limiting
distribution with mean and variance which are proportional to .
Since we get exactly the same law for unrooted trees.
It remains to compute and .
By using the procedure described in Appendix B we get for our expample pattern
and
We observe—as predicted by Theorem 1—that both and can be written as
rational polynomials in .
In what follows we will prove this fact (which completes the proof
of Theorem 1) and also present an easy formula for .
Unfortunately the procedure for calculating is much
more complicated so that it seems that there is no simple formula.
Proposition 4.
Let and be given by Proposition 3 and
let be the polynomials of
Proposition 1, with . Then (of
Theorem 1) is a polynomial in with rational coefficients and is given by
(12)
Proof.
Let be the system of functional equations of
Proposition 1. In Appendix B the
following formula for the mean is derived:
(13)
Here denotes a positive left eigenvector of
, which is unique up to scaling.
From the equality
we get, after denoting
with ,
(14)
Since we have
. Consequently the sum
of all rows of equals for . Thus, denoting the transpose of a vector by , the vector is the unique
positive left eigenvector of , up to scaling.
It is now easy to check that
and that
The fact that is a polynomial in is now a direct consequence
from the fact that consists of
polynomials in and the fact that the coefficients are rational follows
from the fact that has rational coefficients.
∎
Of course, with help of (12) we can easily evaluate
directly. As already indicated it seems that there is
no simple formula for .
Before proving Proposition 5 we state in interesting fact
that will be used in the sequel.
Lemma 1.
Let the partition of that is used in the proof
of Theorem 1. Then
Since the proof is a rather lengthy computation we postpone it to
Appendix C.
Proposition 5.
Let and be given by Proposition 3. Then (of
Theorem 1) is a polynomial in (with rational coefficients).
Proof.
From the proof of Proposition 4 we already know that can be represented
as a polynomial in (with rational coefficients). The next step is to show that
has the same property. For this purpose we have to look at the
system (30)
where . We first observe that
Hence, we can replace the first row of the -matrix
(that is redundant since the
matrix has rank ) by the row and obtain
a regular linear system for . Note that all entries of
the right-hand side of this linear system can be represented as
polynomials in .
Let denote the matrix obtained from by replacing
the first row by . If follows from the
proof of Lemma 1 that .
Further all entries of can be represented as
polynomials in . Thus, has the same
property and consequently has this property, too.
From that it directly follows from (31) that is also
represented as a polynomial in . (By definition, is
a rational polynomial of the entries of .)
With help of (23) this finally leads to a representaion
of as a polynomial in .
∎
In what follows we list some obvious and some less obvious extensions
of our main result. For the sake of conciseness we do not present the
details.
5.1. Several Patterns
Let , , be different patterns. Then the problem
is
to determine the joint (limiting) distribution of the number of occurrences of
, , in trees of size . Using the same
techniques as above (introducing the forest of planted patterns
deduced from the patterns) we
again obtain a system of functional equations. The only difference is that
we now have to count occurrences of , , with
different variables , which is done in the same
fashion as for a single . In view of Theorem 2,
multiple variables make no difference and we obtain a multivariate
Gaussian limiting distribution.
5.2. Patterns Containing Paths of Unspecified Length
It might also be interesting to consider patterns where specific
edges can be replaced by paths of arbitrary length. It turns out
that this case in particular is more involved since a natural
partition of all planted rooted trees is now infinite.
Nevertheless it is possible to replace infinite series of such
classes by one new class and end up with a finite system.
Thus, this leads to a Gaussian limit law (as above).
5.3. Filled and Empty Nodes
In our model we have distinguished between internal (filled) and
external (empty) nodes of the pattern , where the degrees of
the internal (filled) nodes have to match exactly. It also seems to be
possible to consider the following more general matching problem:
Let again be a finite tree, where certain nodes are “filled”
and the remaining ones are “empty”. Now we say that matches
if it occurs as a subtree such that the corresponding degrees of the
filled nodes are equal whereas the degrees of the empty nodes might be
different. It seems that the counting procedure above can be adapted to cover
this case, too. However, it is definitely more involved.
For example, if leaves of the pattern are filled nodes then these
nodes have to be leaves wherever the pattern occurs.
This implies that some of the functions are then explicitly given
in the system and the dependency graph is not strongly connected.
However, it seems that this situation can be managed by eliminating these functions.
Furthermore, and this is more serious, in general one has
to consider infinitely many classes of trees leading to an
infinite system of functional equations, in particular if an internal node is “empty”.
In such a case Theorem 2 cannot be applied
any more. Nevertheless we hope that the approach of Lalley [Lal],
that is applicable to infinite systems of functional equations in one
variable, can be generalized to a corresponding generalization of
Theorem 2 to proper infinite systems. Thus, we
can expect a Gaussian limit law even in this case.
In order to be more precise we will present an easy example. Let
denote the pattern depicted in Figure 7. Here all
nodes are empty. Thus, the corresponding pattern counting problem
is a subgraph counting problem.
Figure 7. Example pattern with empty nodes
We partition all planted trees according to their root degree.
Let denote the set of planted rooted trees with root out-degree
and the correponding generating function (that also counts
the number of subgraph occcurences of ). Further, let
denote the generating function of rooted trees. Then we have
and
This system is easy to solve for . Here we have
and . By taking derivatives
with respect to and summing over all we also get (after some algebra)
This implies that the average value of pattern occurences (in this sense) is of the form
, that is, . In principle it is also possible to get asymptotics for
higher moments but the calculations get more and more involved.
5.4. Simply Generated Trees
Simply generated trees have been introduced by Meir and Moon [MM78]
and are proper generalizations of several types of rooted trees.
Let
be a power series with non-negative coefficients; in particular
we assume that and for some .
We then define the weight of a finite rooted tree
by
where denotes the number of nodes in with successors.
If we set
then the generating function
satisfies the functional equation
In this context, denotes a weighted number of trees of size .
For example, if for all (that is, ) then all rooted trees
have weight and is the number of planted plane
trees.
If (that is, )
then we formally get labeled rooted trees, etc.
Of course, we can proceed in the same way as above and obtain a
system of functional equations that counts occurrences of a specific
pattern in simply generated trees, and (under suitable conditions on
the growth of ) we finally obtain a Gaussian limiting
distribution. This has explicitly been done by Kok in his thesis
[Kok05a, Kok05b].
5.5. Unlabeled Trees
Let denote the number of unlabeled planted rooted trees and the
number of unlabeled unrooted trees. The generating functions are denoted by
The structure of these trees is much more difficult than that of labeled trees.
It turns out that one has
to apply Pólya’s theory of counting and an amazing observation (15)
by
Otter [Ott48].
The generating functions and satisfy the functional
equations
and
(15)
where denotes the cycle index of the symmetric group
.
These functions have a common radius of convergence and
a local expansion of the form
and
where and , and
is the only singularity on the circle of convergence .
Thus, they behave similarly as and .
We also get
and
Furthermore, it is possible to count the number of nodes of specific
degree with the help of bivariate generating functions (compare with
[DG99]). Thus, using Pólya’s theory of counting we can also
obtain a system of functional equations for bivariate generating
functions that count the number of occurrences of a specific
pattern. The major difference to the procedure above is that this
system also contains terms of the form for .
Fortunately these terms can be considered as known functions when
varies around the singularity and varies around
(compare again with [DG99]). Hence, Theorem 2
applies again and we can proceed as above. This has explicitly been
done by Kok in his thesis [Kok05a, Kok05b].
5.6. Forests
First, let us consider the case of labeled trees with generating function
. Then the generating function of unlabeled forests is
given by
Thus, the singular behaviour of is the same as that of
(compare with [DG99]) and consequently we again obtain a
Gaussian limiting distribution for the number of occurrences of a specific
pattern in labeled forests.
The case of unlabeled forests is similar. Here we have
Of course, we can consider other classes of trees or forests
of a given number of trees.
5.7. Forbidden Patterns
It is also interesting to count the number of trees
of size without
a given pattern. The generating function of these numbers is
just , resp. . It is now an easy exercise to
show that there exists an such that
The only thing we have to check is that the radius of convergence
of is larger than the radius of convergence of .
However, this is obvious since the radius of convergence of
(which is the same as that of ) is given by
(for around ) and .
Appendix A Algorithms
In the main part of this paper we showed that the limiting
distribution of the number of pattern occurrences is normal with
computable and . However the family of classes
considered in the first part was especially
created to make the arguments more transparent, there were no
considerations about minimality. In this appendix we focus on
creating another partition of
which has considerably less classes. It also has the properties that it
is recursively describable and allows an
unambiguous definition of the number of additional occurrences
of the pattern.
For example we show that for the pattern of Figure 9 we need
just 8 equations whereas the previous proof would use more than 1000 equations.
First we remark that in some
cases it is profitable to adjust the structure of the system
of equations (1) in Proposition 1 by
allowing an additional polynomial in the
first equation. The first equation then becomes
This system still fits our analytical framework. The advantage is
that for example the minimal system of equations
for counting stars in trees on page 1 now fits
this
modified system.
The idea for constructing will be to create in a first
time a certain family of tree classes , not necessarily building a partition of . Each
of these classes will be defined as the class of all trees in
which “start” in a certain way, or with other words, which match a
certain tree at the root, just as was the case for the
in the main part of this paper. By abuse of notation we will usually
write instead of for this tree. Let
and . Now, by collecting in
all different, non-empty classes of the form
(16)
we will obtain a partition of . This partition will
have a recursive description by construction, see the algorithms
below. Furthermore, if is sufficiently rich, this
partition will allow an unambiguous definition of .
We now make some considerations about the properties that
should possess to make sure that will
allow an unambiguous definition of . Let be a
subclass of . For each tree we can determine the number
of pattern occurrences at the root of . Let
. Because the patterns have finitely
many nodes
and because in each internal node the degree is fixed and the root
has to be part of the match, there are only finitely many ways for a
pattern match. Thus the set will be finite and non-empty.
Now let defined by equation (16) (and non-empty). Now it holds that
(17)
because a tree in is by definition in and
, thus the number of pattern occurrences
at the root is constrained by and . If is
sufficiently rich, then will only consist of a single
number. This will be the case if for each , the
family contains
all classes of trees “starting” with all possible arrangements of
overlapping patterns. Indeed, if we have for example for a certain
tree class that , then there will be
another tree class , which is a subclass of with
. Now the intersections and
will yield tree classes with a singleton ,
namely and .
For example consider a pattern which consists of a node of degree 2
attached to a node of degree 3. The corresponding planted patterns are shown
in Figure 8. Now let consist of the three classes
, shown in the center of Figure 8. We have , because the left planted pattern surely matches and the other does not, , because the left planted pattern does not match and the right one matches at least once, but possibly twice. , because the left pattern does not match and the right one surely matches twice. We see that the only non-empty intersections of the form (16) are , and . We obtain and , which are all singletons.
Figure 8. On the left: Planted patterns. Center: Classes . Right: Classes
. The white box here means a node of out-degree different from 1. Note:
this does not correspond to the output of the algorithms of this appendix
Because we also need a recursive description of the final partition ,
we will construct some additional tree classes . As the partition becomes
finer when dealing with more classes , it is clear that remains well-defined.
On the other hand we do not have to associate a unique number to ,
only to . Therefore we can slightly reduce the family
. In the algorithm below this reduction
of corresponds to considering only proper subtrees of the trees
( itself is excluded).
A coarse-grain description of an algorithm now follows.
(1)
Calculate the set of all planar embeddings of all planted patterns
deducible from the pattern .
(2)
Consider the planted planar
trees issue of step 1 as planar tree classes and
take all possible intersections of any number of those classes. Now take the implied non-planar general tree structure of each class and collect these non-planar planted trees in the set .
(3)
Create a family for the
forest of planted subtrees of trees , excluding the trees
themselves, where each has a recursive description in and where denotes a leaf.
(4)
Now interpret as the class of all trees and interpret the trees as non-planar tree classes. Construct a partition of the class of all planted trees together with a recursive description (compare with (16)).
(5)
Calculate for each term in the recursive description the
number of additional pattern occurrences and
deduce a system of equations for the generating functions
of the classes .
Before giving more detailed algorithms, we give an example.
Consider the pattern of Figure 9.
Figure 9. Example pattern
With the procedure of the main part of the article we would end up
with more than 1000 classes, yielding a system of equations with the
same number of equations. However, by using the following refined algorithm we only need 8
classes.
In the first step we create all planar embeddings of the corresponding planted pattern (trees of Figure 14). This yields planar
trees of which some are shown in Figure 10.
Figure 10. Some of in total 16 planted planar embeddings
We now consider these structures as planar tree classes and
additionally construct tree classes by taking all possible
intersections of any number of the classes issued from step 1. Then, we take the
non-planar implied tree structure of each planar class and collect these
trees in . We end up with 24
different trees: 9 that stem from , 1 from , and 14 from . Some of
them are shown in Figure 11.
Figure 11. Some of in total 24 non-planar trees of
For all proper subtrees for each tree in we now construct a recursive description.
For example, for the leftmost tree of Figure 11 we first consider the subtree consisting of a node with four
leaves. We denote this class by . (Here we use the following structural
notation: denotes a root node, a leaf and denotes a root to which are attached
4 leaves.) The next subtree is a root of out-degree 2 to which a subtree of type is attached.
We denote this with . Figure 12 shows all 6 trees we end up with. Observe on our example that the collection of subtrees at the root extracted from the 24 trees in consists of only 6 trees.
Figure 12. Non-planar trees which possess a recursive description
Their recursive descriptions are given by
(18)
We now interpret in (18) as the class of all planted trees .
The other are also interpreted as tree classes. For example, is
the class of all trees with root out-degree 3.
We now construct a partition based on these classes and their recursive
description of (18). We obtain the classes of Figure
13.
Figure 13. Non-planar partition classes. The white box means “not out-degree 3 or 4” and the white triangle means “anything that is not contained in
the other classes”
Their recursive description is given by
(19)
The last step consists of determining the number of additional
occurrences for each term in the recursive
description (19) and translating (19) in a
system of equations for the generating functions .
As an example we consider the equation for . Class consists
of the trees of root out-degree 3. We get no additional occurrences of the pattern if
we attach a tree of class or to such a root, we get
one additional occurrence for each tree of class or and we have two
additional occurrences for each tree of class attached to the root. This yields the equation
for below. Altogether we obtain:
We can now calculate . We get
. The computation of
was not feasible, because of memory
problems.555The actual computation uses polynomial expressions with more than 200,000 terms. We used Maple 9.5, which used up the memory of 1 GB and a very large part of the 1 GB swap.
A.1. Planar embedding algorithm: GeneralToPlanar
Input: a general planted tree
Output: the set of planted planar
trees that share as their implied general tree
structure
Algorithm:
(1)
write in the form , that is, let
be the root out-degree of and , …,
be the children at the root
(2)
for each between 1 and , recursively compute
(3)
construct and return the set of planar trees
over all choices
of and over all permutations
of
A.2. Tree class intersection algorithm
Input: a set of planted planar trees
Output: the set of non-planar planted trees which are obtained by intersecting planar
tree classes based on and collecting the non-planar tree
structures of the resulting planar tree classes.
Algorithm:
(1)
For each between 1 and , consider all
-tuples of different trees and
determine for each -tuple if may
be interpreted as a non-empty tree class. In that case, let be
the implied non-planar tree structure of and add to the set
.
A.3. DAGification algorithm
We construct a recursive description for the forest of planted
subtrees for each tree in a given set of planted trees. Here we
do not consider the tree itself as a subtree of itself. This calculation
is reminiscent of the DAGification process of computer science (see,
e.g., [ASU86]), which aims at compacting an expression tree by
sharing repeated subexpressions. However, if we interpret those
subtrees as classes, the intersection of two classes need not be
empty.
Input: set of planted trees
Output: a number and a recursive description of
the forest of planted subtrees of
the trees of , of the
form
with the constraint for all and
Algorithm:
(Initialization)
Introduce the exceptional type to
denote the planted tree consisting of a single node (in other words,
a leaf) and set to
(Main loop)
For all planted trees of
perform a depth-first traversal of the tree, starting from the
planted root; during this recursive calculation, at each
node :
(1)
if the node is a leaf, return the type
(2)
else, recursively determine the type associated with each child
of
(3)
If is a not the planted root of the tree,
write the subtree rooted at as a (commutative) product of the types obtained in the
previous step
(4)
look up the uniquification table to check whether this product
has already been assigned a type
(5)
if not existent, increment , create a new type , remember its
definition , and assign to the product in
the uniquification table.
(6)
return the type if it was found by lookup, otherwise return
(Conclusion)
Return and the sequence of definitions of the
form , for .
A.4. Disambiguating algorithm
The idea of the algorithm below is to consider each class of trees,
, in turn, introducing its defining equation
into the calculation, while maintaining (and refining) a partition
of the total class of planted trees. To be able to do so, it is
crucial that the recursive equation for refers to
classes with only, starting with the special
class , the full class of planted trees.
At any stage in the algorithm, the class of -ary trees is given
as the disjoint union of Cartesian products
where denotes the number of components in the tuple
. In the process of the algorithm below, each class
gets represented in a “polynomial” form like above, summed over a
subset of the set of integer sequences of a given length . Computing
intersections and differences of classes means merely computing
intersections and differences of the in their
representations, because of the recursive structure of the input and
of the algorithm itself.
Input:
•
A family of classes of trees with recursive descriptions of the form
with the constraint for all and
Output:
•
an integer implying a partition
•
a representation of each of the form
•
a recursive description of the of the form
being implicitly described as
Algorithm:
(Initialization)
Start with the trivial partition
for , the single representation , that is,
.
(Main loop)
For from 1 to do
(1)
replace each in the definition of with its current
representation in terms of the , expand, and set to the
result, so as to get a representation of of the form
(2)
for from 1 to while do
(a)
set to by setting
to
(b)
if , then do
(i)
set to
(ii)
if , then
(A)
create a new with description :
increment before setting to , that is,
before setting
to
(B)
split into in the representations
of the , that is, add into each set
containing
(C)
split into in the
descriptions of the , , and , that is, for
each sequence in each of the , ,
and , add sequences with replaced by
when the sequence involves (if occurs more than
once, then replace by or in all possible ways)
(D)
set to by setting
to
(iii)
set to , which is also , and update by setting it
to
(3)
if , then
(a)
create a new with
description : increment before setting to , that
is, before setting to
(b)
split into in the representations
of the , that is, add into each set
containing
(c)
split into in the descriptions of
the , that is, for each sequence in each of the ,
add sequences with replaced by
when the sequence involves (if occurs more than
once, then replace by or in all possible ways)
(4)
represent as the union of all those s that have
contributed a non-empty at step (2b)
and of if a new was created at
step (3a), that is, create the corresponding
set consisting of the contributing s, together with if
relevant
(Final step)
Return , the representations of the
for , the recursive descriptions of the
for
We will explicitly show the stages through which the algorithm goes
when running with the input . For readability, we will
keep expressions in factored form.
:
from , we derive and
.
:
from , we derive
and , where .
:
from , we derive and , where .
:
from , we
derive
and , where .
:
from , we derive and , where .
:
from , we derive and
,
where .
A.5. Calculation of : CountRootOccurrences
Input: non-planar planted trees ,
, …,
Output: the number of occurrences of any of
the at the root of
count and return the number of pairs such that
is element of and occurs at the root of
As an example we calculate . This corresponds to calculating the
number of additional occurrences in the class . The input trees
are shown in Figure 14. Here corresponds to the class
and correspond to the three possible ways of planting the example pattern.
Figure 14. Input trees
We take as fixed planar embedding of the embedding of Figure
14. We now iterate over the different
planar embeddings of (6 of them), of (2 of them), and of (8 of them), and determine for each () whether it occurs at the root
of . Consider for example the four embeddings shown in Figure 10 (three embeddings of , one embedding of ). The leftmost
embedding matches , the one next to it as well. The third one does not match , because the node with
out-degree four is in the wrong position. The rightmost embedding clearly does not match either. By considering all
embeddings and counting the matches we get .
The algorithm calculates the correct value of , because the partition consisting of the
classes is sufficiently fine. From this follows that every match above of a planar embedding really
gives rise to exactly one additional pattern occurrence. See the considerations made at the beginning of this
appendix.
By now the transformation to a systems of equations is easy. We get
the terms by replacing a term in the
recursive description of by a term . Here it is assumed that
terms that represent the same tree classes (like and
) are identified before. It is clear that there are
only finitely many terms for which might be
non-zero a priori.
Appendix B Asymptotics of Analytic Systems
The following theorem is a slightly modified version of the main
theorem from [Drm97]. We denote the transpose of a vector
by . Let be a column
vector of functions , , with complex
variables , , which are analytic around and
satisfy for . We are
interested in the analytic solution of the functional
equation
(20)
with , i.e., we demand that the (unknown)
functions , , satisfy
the system of functional equations
It is convenient to define the notion of a
dependency (di)graph for such a system of
functional equations .
The vertices
are just the unknown functions
and an ordered pair is contained in the edge set
if and only if really depends on .
If the functions have non-negative Taylor coefficients
then it is easy to see that the solutions have the
same property. (One only has to solve the system iteratively
by setting and for . The limit is the (unique) solution of the
system above.)
Now suppose that is another analytic function with
non-negative Taylor coefficients. Then has
a power series expansion
with non-negative coefficients . In fact, we assume that
for every there exists such that .
Let denote an -dimensional discrete random vector with
(21)
where
are the coefficients of
The following theorem shows that (under suitable analyticity conditions)
has a Gaussian limiting distribution.
Theorem 2.
Let be functions analytic
around , , ,
whose Taylor coefficients are all
non-negative, such that , , , and such that there exists with
. Furthermore assume that
the region of convergence of is large enough that there
exists a non-negative solution , of the system of
equations
inside it. Let
denote the analytic solutions of the system
(22)
with and assume that
for , where .
Moreover, let denote an analytic function with
non-negative Taylor coefficients such that the point is contained in the region of
convergence. Finally, let random vectors be
defined by (21).
If the dependency graph of the system
(22)
in the unknown functions
is strongly connected then the sequence of
random vectors
admits a Gaussian limiting distribution with mean value
and covariance matrix
The row vector is given by
and the matrix by
(23)
where (and ) is the solution of
the (extended) system
(24)
(25)
The proof of Theorem 2 is exactly the same as
that given in [Drm97]. The main observation is that
the assumptions above show that the solutions admit
a local representation of the form
(where is close to and close to ).
The assumption that the dependency graph is strongly connected
ensures that the location of the singularity of all functions is
determined by the common function . Thus, we get the same
property for :
(26)
It is then well known (see [BR83, Drm94]) that a
square-root singularity plus some minor conditions implies
asymptotic normality of the coefficients (in the sense introduced
above) with mean and covariance expressed in terms of derivatives of
. Note, for example, that the assumption for
ensures that for sufficiently large and from
this follows that is the only singularity on the
radius of convergence of .
In what follows we comment on the evaluation of and
. The problem is to extract the derivatives of .
The function is the solution of the system (24–25) and
is exactly the location of the singularity of the mapping when is fixed (and close to ).
Let and denote the solutions
of (24–25). Then we have
(27)
Taking derivatives with respect to we get
(28)
where the three terms in denote evaluations at
of the partial derivatives of ,
and where and denote the Jacobian of
resp. with respect to . In particular, for
we have and and, of course
Since is a non-negative matrix and the dependency graph is
strongly connected there is a unique Perron-Frobenius eigenvalue of
multiplicity 1. Here this eigenvalue equals 1. Thus,
has rank and has (up to scaling) a unique positive left
eigenvector :
The derivation of is more involved.
We first define as the (generalized) vector
product666More precisely this is the wedge product combined with
the Hodge duality. of the last columns of the matrix
. Observe that
In particular we have
Then from
(30)
we can calculate . (The first system has rank , this means that
we can skip the first equation. This reduced system
is then completed to a regular system by appending the second equation (30).)
where
denote the respective partial derivatives and where we omitted the
dependence on . With the knowledge of and
we can now evaluate at and we finally calculate from (23).
where denotes the matrix where we replace the
first row by .
Thus, it remains to prove that .
For this purpose we have to be more explicit with the partition .
More precisely we construct recursively from level to level.
This procedure is similar to that of Proposition 3 but not the same. In order to make our
arguments more transparent we restrict ourselves to 4 steps. Note that this procedure
also provides a recursive description of the polynomials .
One starts with , where and . This
means that collects all trees where the root out-degree is not contained in and those
where it is contained in . For example, if then the generating functions
of this (trivial) partition are given by and by .
Then we partition according to structure of the subtrees of the root, where
we distinguish between the previous classes and . We
get , where and .
In particular, if then , the class collects all trees
with root out-degree where both subtrees
of the root are in class , collects all trees with with root out-degree where
one subtree of the root is in class and the other one in class , and collects
those trees
where both subtrees of the root are in class . The corresponding generating functions
are given by , by , and by
. Of course, we also have and
.
In the same fashion we proceed further.
We partition () according to the structure
of the subtrees of the root (that are now taken from )
and denote them by .
Further we define sets by .
If then , , is divided into three parts, and
is divided into 6 parts: , , .777By the way this leads to the partition
that is used in the proof of Theorem 1 resp. of Proposition 1.
Finally, we partition
() according according to the structure
of the subtrees of the root that are taken from the and denote
them by .
As in the previous step we define sets by
.
In general we have to iterate this procedure until a certain level and
get almost the same partition as in the proof of Proposition 1. The only difference is that at the lowest level we only distinguish between nodes with degree in and degree not in . However this is no real restriction as we can extend the partition above with an additional level and we will have a well-defined number of additional occurrences for each class. We again obtain a partition which fits Proposition 1.
We recall that this recursive procedure directly provides
a recursive description of the system of functional equations. In particular
we have
where can be actually written as a polynomial in .
Next
where can be written as a polynomial in . Further,
In other words, the sum can be written as polynomial in .
Finally,
where can be written as a polynomial in and and
we have
Let denote the -submatrix of where we omit the first row and column.
Then has the following structure:
where
and
The condition that can be written as a polynomial in implies that
for all , that is,
each row of is either zero or all entries are the same.
Further, if we fix and sum over all rows then
we get
Since can be written as a polynomial in () we have
for all , where we set .
Similarly if we fix and sum over all rows then
we get
Since can be written as a polynomial in and
we have for all .
Now we will calculate the determinant of the matrix
(By we denote an entry we do not care.)
We now perform the following row operations. For every
we substitute the first row of
by the sum of the corresponding rows . Since for all
this sum of the rows has the form
We now add the very first row (that equals ) times to this row and obtain
Next we fix and such that and substitute the first row of
by the sum of the rows . Since for every it holds that for all
this sum has the following form
where if and only if and otherwise.
This means, for every the entries are either all equal or
if then we have to add 1 at proper positions. For every we now add row
times. If then we get a zero block . If we get a block of
the form
This means that this row is replaced by
With help of these rows we can eliminate all further entries of that come
from . (Here we use the fact that each row of
is either zero or all entries are the same.) This means that we finally end up with a matrix of
the form
where for and
is of the form
with
It is now an easy task to transform the matrix
(with help of row transforms) to the identity matrix. Furthermore we can
transform the very first row of to and end up with a matrix of the form
Obviously, this matrix has determinant . Since the above row transforms do not change the value of the
determinant we, thus, obtain .
Acknowledgement. The authors want to thank Philippe Flajolet for
several discussions on the topic of the paper and for many useful hints.
References
[ASU86]
Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman.
Compilers: principles, techniques, and tools.
Addison-Wesley Longman Publishing Co., Inc., 1986.
[BR83]
Edward A. Bender and L. Bruce Richmond.
Central and local limit theorems applied to asymptotic enumeration.
II. Multivariate generating functions.
J. Combin. Theory Ser. A, 34(3):255–265, 1983.
[DG99]
Michael Drmota and Bernhard Gittenberger.
The distribution of nodes of given degree in random trees.
J. Graph Theory, 31(3):227–253, 1999.
[Drm94]
Michael Drmota.
Asymptotic distributions and a multivariate Darboux method in
enumeration problems.
J. Combin. Theory Ser. A, 67(2):169–184, 1994.
[Drm97]
Michael Drmota.
Systems of functional equations.
Random Structures Algorithms, 10(1-2):103–124, 1997.
Average-case analysis of algorithms (Dagstuhl, 1995).
[DZ89]
Nachum Dershowitz and Shmuel Zaks.
Patterns in trees.
Discrete Appl. Math., 25(3):241–255, 1989.
[FGM97]
Philippe Flajolet, Xavier Gourdon, and Conrado Martínez.
Patterns in random binary search trees.
Random Structures Algorithms, 11(3):223–244, 1997.
[FS80a]
Philippe Flajolet and Jean-Marc Steyaert.
On the analysis of tree-matching algorithms.
In Automata, languages and programming (Proc. Seventh Internat.
Colloq., Noordwijkerhout, 1980), volume 85 of Lecture Notes in Comput.
Sci., pages 208–219. Springer, Berlin, 1980.
[FS80b]
Philippe Flajolet and Jean-Marc Steyaert.
On the analysis of tree-matching algorithms.
In Trees in algebra and programming (Proc. 5th Lille Colloq.,
Lille, 1980), pages 22–40. Univ. Lille I, Lille, 1980.
[Kok05a]
Gerard Kok.
Pattern distribution in various types of random trees.
Conf. Proc. Analysis of Algorithms, Barcelona, 2005.
[Kok05b]
Gerard J.P. Kok.
The distribution of patterns in random trees, thesis.
Institut für Diskrete Mathematik und Geometrie, TU Wien,
Austria, 2005.
[Lal]
S. Lalley.
Random walks on infinite free products and infinite algebraic systems
of generating functions.
Manuscript.
[MM78]
A. Meir and J. W. Moon.
On the altitude of nodes in random trees.
Canad. J. Math., 30(5):997–1015, 1978.
[Ott48]
Richard Otter.
The number of trees.
Ann. of Math. (2), 49:583–599, 1948.
[RS75]
R. W. Robinson and A. J. Schwenk.
The distribution of degrees in a large random tree.
Discr. Math., 12:359–372, 1975.
[Ruc88]
Andrzej Ruciński.
When are small subgraphs of a random graph normally distributed?
Probab. Theory Related Fields, 78(1):1–10, 1988.
[SF83]
Jean-Marc Steyaert and Philippe Flajolet.
Patterns and pattern-matching in trees: an analysis.
Inform. and Control, 58(1-3):19–58, 1983.